Abstract. We present a new algorithm for estimating the Personalized
PageRank (PPR) between a source and target node on undirected
graphs, with sublinear running-time guarantees over the worst-case choice
of source and target nodes. Our work builds on a recent line of work on
bidirectional estimators for PPR, which obtained sublinear running-time
guarantees but in an average-case sense, for a uniformly random choice of
target node. Crucially, we show how the reversibility of random walks on
undirected networks can be exploited to convert average-case to worst-case
guarantees. While past bidirectional methods combine forward random
walks with reverse local pushes, our algorithm combines forward
local pushes with reverse random walks. We also discuss how to modify
our methods to estimate random-walk probabilities for any length
distribution, thereby obtaining fast algorithms for estimating general graph
diffusions, including the heat kernel, on undirected networks.


1 Introduction 

Ever since their introduction in the seminal work of Page et al. [23], PageRank
and Personalized PageRank (PPR) have become some of the most important
and widely used network centrality metrics (a recent survey [13] lists several
examples). At a high level, for any graph G, given 'teleport' probability $\alpha$ and a
'personalization distribution' $\sigma$ over the nodes of G, PPR models the importance
of every node from the point of view of $\sigma$ in terms of the stationary probabilities
of 'short' random walks that periodically restart from $\sigma$ with probability $\alpha$. It
can be defined recursively as giving importance $\alpha$ to $\sigma$, and in addition giving
every node importance based on the importance of its in-neighbors.

Formally, given normalized adjacency matrix $W = D^{-1}A$, the Personalized
PageRank vector $\pi_\sigma$ with respect to source distribution $\sigma$ is the solution to

\[ \pi_\sigma = \alpha\sigma + (1-\alpha)\,\pi_\sigma W. \tag{1} \]

An equivalent definition is in terms of the terminal node of a random-walk
starting from $\sigma$. Let $\{X_0, X_1, X_2, \ldots\}$ be a random-walk starting from $X_0 \sim \sigma$,
and $L \sim \text{Geometric}(\alpha)$. Then the PPR of any node t is given by [4]:

\[ \pi_\sigma[t] = \mathbb{P}[X_L = t]. \tag{2} \]

The equivalence of these definitions can be seen using a power series expansion.
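As a rough illustration of the two definitions (this sketch is not part of the original text), the Python below computes the fixed point of Eqn. (1) by power iteration and compares it with a sampling estimate based on Eqn. (2). The adjacency-list format, the function names, and the toy graph are assumptions made only for this example; the walk length is taken to be geometric on {0, 1, 2, ...}, which is the convention under which the power series of Eqn. (1) matches Eqn. (2).

```python
import random


def ppr_power_iteration(adj, sigma, alpha, iters=200):
    """Iterate pi <- alpha*sigma + (1-alpha)*pi*W with W = D^{-1}A (Eqn. 1)."""
    pi = {v: sigma.get(v, 0.0) for v in adj}
    for _ in range(iters):
        nxt = {v: alpha * sigma.get(v, 0.0) for v in adj}
        for u in adj:
            share = (1.0 - alpha) * pi[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        pi = nxt
    return pi


def ppr_monte_carlo(adj, source, alpha, walks, rng):
    """Estimate pi_s[t] = P[X_L = t] with L ~ Geometric(alpha) (Eqn. 2)."""
    counts = {v: 0 for v in adj}
    for _ in range(walks):
        v = source
        # Stop with probability alpha before every step (including the first).
        while rng.random() >= alpha:
            v = rng.choice(adj[v])
        counts[v] += 1
    return {v: c / walks for v, c in counts.items()}


# Hypothetical 4-node undirected graph, stored as adjacency lists.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

if __name__ == "__main__":
    exact = ppr_power_iteration(GRAPH, {0: 1.0}, alpha=0.2)
    approx = ppr_monte_carlo(GRAPH, 0, 0.2, walks=20000, rng=random.Random(0))
    for v in GRAPH:
        print(v, round(exact[v], 4), round(approx[v], 4))
```

The two columns printed for each node should agree up to sampling noise, which shrinks as the number of walks grows.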

In this work, we focus on developing PPR-estimators with worst-case sublinear
guarantees for undirected graphs. Apart from their technical importance, our
results are of practical relevance as several large-scale applications of PPR are
based on undirected networks. For example, Facebook (which is an undirected
social network) used Personalized PageRank for friend recommendation [5]. The
social network Twitter is directed, but Twitter's friend recommendation
algorithm (Who to Follow) [16] uses an algorithm called personalized SALSA [19,6],
which first converts the directed network into an expanded undirected graph³,
and then computes PPR on this new graph. Random walks have also been used
for collaborative filtering by the YouTube team [7] (on the undirected user-item
bipartite graph), to predict future items a user will view. Applications like this
motivate fast algorithms for PPR estimation on undirected graphs.

Equations (1) and (2) suggest two natural estimation approaches for PPR:
linear-algebraic iterative techniques, and Monte Carlo. The linear-algebraic
characterization of PageRank in Eqn. (1) suggests the use of power
iteration (or other localized iterations; cf. Section 1.2 for details), while Eqn. (2)
is the basis for a Monte-Carlo algorithm, wherein we estimate $\pi_\sigma[t]$ by sampling
independent L-step paths, each starting from a random state sampled from $\sigma$.
For studying PageRank estimation algorithms, smaller probabilities are more
difficult to estimate than large ones, so a natural parametrization is in terms
of the minimum PageRank we want to detect. Formally, given any source $\sigma$,
target node $t \in V$ and a desired minimum probability threshold $\delta$, we want
algorithms that give accurate estimates whenever $\pi_\sigma[t] > \delta$. Improved algorithms
are motivated by the slow convergence of these approaches: both Monte Carlo
and linear-algebraic techniques have a running time of $\Omega(1/\delta)$ for PageRank
estimation. Furthermore, this is true not only for worst-case choices of target state
t: on average, Monte Carlo requires $\Omega(1/\delta)$ time to estimate a probability of
size $\delta$. Power iteration takes $\Omega(m)$ time, where m is the number of edges, and
the work [21] shows empirically that the local version of power iteration scales
with $1/\delta$ for $\delta > 1/m$.

In a recent line of work, linear-algebraic and Monte-Carlo techniques were
combined to develop new bidirectional PageRank estimators FAST-PPR [22] and
Bidirectional-PPR [20], which gave the first significant improvement in the
running-time of PageRank estimation since the development of Monte-Carlo
techniques. Given an arbitrary source distribution $\sigma$ and a uniform random
target node t, these estimators were shown to return an accurate PageRank
estimate with an average running-time of $O(\sqrt{d/\delta})$, where $d = m/n$ is the
average degree of the graph. With additional precomputation and storage,
the authors prove worst-case guarantees for this bidirectional estimator, but in
practice that is a large precomputation requirement. This raised the challenge
of designing an algorithm with similar running-time guarantees over a worst-case
choice of target node t. Inspired by the bidirectional estimators in [22,20],
we propose a new PageRank estimator for undirected graphs with worst-case
running time guarantees.

3 Specifically, for each node u in the original graph, SALSA creates two virtual nodes,
a 'consumer-node' u' and a 'producer-node' u'', which are linked by an undirected
edge. Any directed edge (u, v) is then converted into an undirected edge (u', v'') from
u's consumer node to v's producer node.

1.1 Our Contribution 

We present the first estimator for personalized PageRank with sublinear
running time in the worst case on undirected graphs. We formally present our
Undirected-BiPPR algorithm in Section 2, and prove that it has the following
accuracy and running-time guarantees:

Result 1 (See Theorem 1 in Section 2) Given any undirected graph G, teleport
probability $\alpha$, source node s, target node t, threshold $\delta$ and relative error $\epsilon$,
the Undirected-BiPPR estimator (Algorithm 2) returns an unbiased estimate
$\hat{\pi}_s[t]$ for $\pi_s[t]$, which, with probability greater than $1 - p_{fail}$, satisfies:

\[ |\hat{\pi}_s[t] - \pi_s[t]| < \max\{\epsilon\,\pi_s[t],\ 2e\delta\}. \]


Result 2 (See Theorem 2 in Section 2) Let any undirected graph G, teleport
probability $\alpha$, threshold $\delta$ and desired relative error $\epsilon$ be given. For any
source, target pair (s, t), the Undirected-BiPPR algorithm has a running-time of

\[ O\left(\frac{\sqrt{\log(1/p_{fail})}}{\alpha\epsilon}\sqrt{\frac{d_t}{\delta}}\right), \]

where $d_t$ is the degree of the target node t.


In personalization applications, we are often only interested in personalized
importance scores if they are greater than global importance scores, so it is natural
to set $\delta$ based on the global importance of t. Assuming G is connected, in the limit
$\alpha \to 0$, the PPR vector for any start node s converges to the stationary
distribution of infinite-length random-walks on G, that is, $\lim_{\alpha\to 0}\pi_s[t] = d_t/m$. This
suggests that a natural PPR significance-test is to check whether $\pi_s[t] > d_t/m$.
To this end, we have the following corollary:

Result 3 (See Corollary 1 in Section 2) For any graph G and any (s, t)
pair such that $\pi_s[t] > d_t/m$, with high probability⁴, Undirected-BiPPR
returns an estimate $\hat{\pi}_s[t]$ with relative error $\epsilon$ with a worst-case running-time of
$O(\sqrt{m\log n}/\epsilon)$.

Finally, in Section 3, using ideas from [8], we extend our technique to
estimating more general random-walk transition-probabilities on undirected graphs,
including graph diffusions and the heat kernel [11,18].

4 Following convention, we use w.h.p. to mean with probability greater than $1 - \frac{1}{n}$.

1.2 Existing Approaches for PageRank Estimation


We first summarize the existing methods for PageRank estimation: 

Monte Carlo Methods: A standard method [4,9] for estimating $\pi_\sigma[t]$ is by
using the terminal node of independently generated random walks of length
$L \sim \text{Geometric}(\alpha)$ starting from a random node sampled from $\sigma$. Simple
concentration arguments show that we need $O(1/\delta)$ samples to get an accurate
estimate of $\pi_\sigma[t]$, irrespective of the choice of t and graph G.

Linear-Algebraic Iterations: Since the PageRank vector is the stationary
distribution of a Markov chain, it can also be estimated via forward or reverse
power iterations. A direct power iteration is often infeasible for large graphs; in
such cases, it is preferable to use localized power iterations [2,1]. These
local-update methods can also be used for other transition probability estimation
problems such as heat kernel estimation [18]. Local update algorithms are often
fast in practice, as unlike full power iteration methods they exploit the local
structure of the chain. However, even in sparse Markov chains and for a large
fraction of target states, their running time can be $\Omega(1/\delta)$. For example, consider
a random walk on a random d-regular graph and let $\delta = o(1/n)$. Then for
$\ell \sim \log_d(1/\delta)$, verifying $\pi_s^{\ell}[t] > \delta$ is equivalent to uncovering the entire $\log_d(1/\delta)$
neighborhood of s. However, since a large random d-regular graph is (w.h.p.) an
expander, this neighborhood has $\Omega(1/\delta)$ distinct nodes.

Bidirectional Techniques: Bidirectional methods are based on simultaneously
working forward from the source node s and backward from the target node
t in order to improve the running-time. One example of such a bidirectional
technique is the use of colliding random-walks to estimate length-$2\ell$
random-walk transition probabilities in regular undirected graphs [14,17]. The main idea
here is to exploit the reversibility by using two independent random walks of
length $\ell$ starting from s and t respectively, and detecting if they collide. This
results in reducing the number of walks required by a square-root factor, based
on an argument similar to the birthday-paradox.

The FAST-PPR algorithm of Lofgren et al. [22] was the first bidirectional
algorithm for estimating PPR in general graphs; this was subsequently refined
and improved by the Bidirectional-PPR algorithm [20], and also generalized
to other Markov chain estimation problems [8]. These algorithms are based on
using a reverse local-update iteration from the target t (adapted from Andersen
et al. [1]) to smear the mass over a larger target set, and then using
random-walks from the source s to detect this target set. From a theoretical perspective, a
significant breakthrough was in showing that for arbitrary choice of source node
s these bidirectional algorithms achieved an average running-time of $O(\sqrt{d/\delta})$
over uniform-random choice of target node t. In contrast, both local-update and
Monte Carlo have a running-time of $\Omega(1/\delta)$ for uniform-random targets. More
recently, [10] showed that a similar bidirectional technique achieved a sublinear
query-complexity for global PageRank computation, under a modified query
model, in which all neighbors of a given node could be found in $O(1)$ time.




2 PageRank Estimation in Undirected Graphs 

We now present our new bidirectional algorithm for PageRank estimation in 
undirected graphs. 


2.1 Preliminaries 

We consider an undirected graph G(V, E), with n nodes and m edges. For ease
of notation, we henceforth consider unweighted graphs, and focus on the simple
case where $\sigma = e_s$ for some single node s. We note however that all our results
extend to weighted graphs and any source distribution $\sigma$ in a straightforward
manner.


2.2 A Symmetry for PPR in Undirected Graphs 

The Undirected-BiPPR Algorithm critically depends on an underlying reversibility
property exhibited by PPR vectors in undirected graphs. This property, stated
before in several earlier works [3,15], is a direct consequence of the reversibility
of random walks on undirected graphs. To keep our presentation self-contained,
we present this property, along with a simple probabilistic proof, in the form of
the following lemma:

Lemma 1. Given any undirected graph G, for any teleport probability $\alpha \in (0,1)$
and for any node-pair $(s,t) \in V^2$, we have:

\[ \pi_s[t] = \frac{d_t}{d_s}\,\pi_t[s]. \]

Proof. For path $P = \{s, v_1, v_2, \ldots, v_k, t\}$ in G, we denote its length as $\ell(P)$
(here $\ell(P) = k+1$), and define its reverse path to be $\bar{P} = \{t, v_k, \ldots, v_2, v_1, s\}$;
note that $\ell(P) = \ell(\bar{P})$. Moreover, we know that a random-walk starting from s
traverses path P with probability
$\mathbb{P}[P] = \frac{1}{d_s}\cdot\frac{1}{d_{v_1}}\cdots\frac{1}{d_{v_k}}$, and thus, it is easy
to see that we have:

\[ \mathbb{P}[P]\cdot d_s = \mathbb{P}[\bar{P}]\cdot d_t. \tag{3} \]

Now let $\mathcal{P}_{st}$ denote the set of paths in G starting at s and terminating at t. Then
we can re-write Eqn. (2) as:

\[ \pi_s[t] = \sum_{P\in\mathcal{P}_{st}} \alpha(1-\alpha)^{\ell(P)}\,\mathbb{P}[P]
 = \frac{d_t}{d_s}\sum_{\bar{P}\in\mathcal{P}_{ts}} \alpha(1-\alpha)^{\ell(\bar{P})}\,\mathbb{P}[\bar{P}]
 = \frac{d_t}{d_s}\,\pi_t[s]. \qquad \square \]
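The symmetry in Lemma 1 is easy to check numerically. The sketch below is illustrative only: the graph, the function names, and the use of a plain fixed-point iteration to obtain PPR vectors are assumptions of this example, not the authors' implementation. It verifies that $d_s\,\pi_s[t] = d_t\,\pi_t[s]$ for every pair of nodes.

```python
def ppr_vector(adj, s, alpha, iters=300):
    """PPR vector pi_s from the fixed-point iteration of Eqn. (1), run to convergence."""
    pi = {v: 1.0 if v == s else 0.0 for v in adj}
    for _ in range(iters):
        nxt = {v: alpha if v == s else 0.0 for v in adj}
        for u in adj:
            share = (1.0 - alpha) * pi[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        pi = nxt
    return pi


def max_symmetry_gap(adj, alpha):
    """Largest |d_s * pi_s[t] - d_t * pi_t[s]| over all ordered pairs (s, t)."""
    pi = {s: ppr_vector(adj, s, alpha) for s in adj}
    return max(
        abs(len(adj[s]) * pi[s][t] - len(adj[t]) * pi[t][s])
        for s in adj
        for t in adj
    )


# Hypothetical undirected graph with unequal degrees: a triangle plus a path.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    print("max symmetry gap:", max_symmetry_gap(GRAPH, alpha=0.2))
```

On a directed graph the same check would generally fail, which is why the argument is specific to undirected (reversible) walks.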


2.3 The Undirected-BiPPR Algorithm 

At a high level, the Undirected-BiPPR algorithm has two components:

Forward-work: Starting from source s, we first use a forward local-update
algorithm, the ApproximatePageRank(G, $\alpha$, s, $r_{max}$) algorithm of Andersen et
al. [2] (shown here as Algorithm 1). This procedure begins by placing one
unit of 'residual' probability-mass on s, then repeatedly selecting some node
u, converting an $\alpha$-fraction of the residual mass at u into probability mass,
and pushing the remaining residual mass to its neighbors. For any node u, it
returns an estimate $p_s[u]$ of its PPR $\pi_s[u]$ from s as well as a residual $r_s[u]$
which represents un-pushed mass at u.

Reverse-work: We next sample random walks of length $L \sim \text{Geometric}(\alpha)$
starting from t, and use the residual at the terminal nodes of these walks
to compute our desired PPR estimate. Our use of random walks backwards
from t depends critically on the symmetry in undirected graphs presented in
Lemma 1.

Note that this is in contrast to FAST-PPR and Bidirectional-PPR, which
perform the local-update step in reverse from the target t, and generate
random-walks forwards from the source s.


Algorithm 1 ApproximatePageRank(G, $\alpha$, s, $r_{max}$) [2]

Inputs: graph G, teleport probability $\alpha$, start node s, maximum residual $r_{max}$
Initialize (sparse) estimate-vector $p_s = 0$ and (sparse) residual-vector $r_s = e_s$
    (i.e. $r_s[v] = 1$ if $v = s$; else 0)
while $\exists u \in V$ s.t. $r_s[u]/d_u > r_{max}$ do
    for $v \in \mathcal{N}[u]$ do
        $r_s[v]$ += $(1-\alpha)\,r_s[u]/d_u$
    end for
    $p_s[u]$ += $\alpha\,r_s[u]$
    $r_s[u] = 0$
end while
return $(p_s, r_s)$
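A minimal Python sketch of the push procedure in Algorithm 1 follows. It assumes the graph is given as a dictionary of adjacency lists and uses a FIFO queue to pick the next node to push; the pseudocode leaves the selection order open, so the queue is a choice made for this example rather than something prescribed by [2].

```python
from collections import deque


def approximate_pagerank(adj, alpha, s, r_max):
    """Forward local push from s; returns sparse dicts (p_s, r_s)."""
    p = {}
    r = {s: 1.0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        mass = r.get(u, 0.0)
        if mass / len(adj[u]) <= r_max:
            continue  # stale queue entry: u no longer violates the threshold
        # Convert an alpha-fraction of the residual into PPR estimate mass.
        p[u] = p.get(u, 0.0) + alpha * mass
        r[u] = 0.0
        # Spread the remaining (1 - alpha)-fraction evenly over the neighbors.
        share = (1.0 - alpha) * mass / len(adj[u])
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] / len(adj[v]) > r_max:
                queue.append(v)
    return p, r


# Hypothetical toy graph used only to exercise the function.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    p_s, r_s = approximate_pagerank(GRAPH, alpha=0.2, s=0, r_max=1e-3)
    print("estimate mass:", sum(p_s.values()), "residual mass:", sum(r_s.values()))
```

Each push moves an $\alpha$-fraction of the pushed residual into the estimate and keeps the rest as residual, so the estimate and residual masses always sum to one.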


In more detail, our algorithm will choose a maximum residual parameter
$r_{max}$, and apply the local push operation in Algorithm 1 until for all v, $r_s[v]/d_v < r_{max}$.
Andersen et al. [2] prove that their local-push operation preserves the
following invariant for vectors $(p_s, r_s)$:

\[ \pi_s[t] = p_s[t] + \sum_{v\in V} r_s[v]\,\pi_v[t], \quad \forall t \in V. \tag{4} \]

Since we ensure that $\forall v,\ r_s[v]/d_v < r_{max}$, it is natural at this point to use the
symmetry Lemma 1 and re-write this as:

\[ \pi_s[t] = p_s[t] + d_t \sum_{v\in V} \frac{r_s[v]}{d_v}\,\pi_t[v]. \]

Now using the fact that $\sum_{v\in V}\pi_t[v] = 1$, we get that $\forall t \in V$,
$|\pi_s[t] - p_s[t]| < r_{max}\,d_t$.
However, we can get a more accurate estimate by using the residuals. The
key idea of our algorithm is to re-interpret this as an expectation:

\[ \pi_s[t] = p_s[t] + d_t\,\mathbb{E}_{V\sim\pi_t}\left[\frac{r_s[V]}{d_V}\right]. \tag{5} \]

We estimate the expectation using standard Monte-Carlo. Let $V_i \sim \pi_t$ and
$X_i = r_s[V_i]\,d_t/d_{V_i}$, so we have $\pi_s[t] = p_s[t] + \mathbb{E}[X_i]$. Moreover, each sample $X_i$ is
bounded by $d_t r_{max}$ (this is the stopping condition for ApproximatePageRank),
which allows us to efficiently estimate its expectation. To this end, we generate
w random walks, where

\[ w = \frac{c}{\epsilon^2}\cdot\frac{r_{max}}{\delta/d_t}. \]

The choice of c is specified in Theorem 1. Finally, we return the estimate:

\[ \hat{\pi}_s[t] = p_s[t] + \frac{1}{w}\sum_{i=1}^{w} X_i. \]

The complete pseudocode is given in Algorithm 2.


Algorithm 2 Undirected-BiPPR(s, t, $\delta$)

Inputs: graph G, teleport probability $\alpha$, start node s, target node t, minimum
    probability $\delta$, accuracy parameter $c = 3\ln(2/p_{fail})$ (cf. Theorem 1)
$(p_s, r_s)$ = ApproximatePageRank(G, $\alpha$, s, $r_{max}$)
Set number of walks $w = c\,d_t\,r_{max}/(\epsilon^2\delta)$
for index $i \in [w]$ do
    Sample a random walk starting from t, stopping after each step with probability
        $\alpha$; let $V_i$ be the endpoint
    Set $X_i = r_s[V_i]\,d_t/d_{V_i}$
end for
return $\hat{\pi}_s[t] = p_s[t] + (1/w)\sum_{i\in[w]} X_i$
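To make the two phases concrete, here is a hedged Python sketch of the estimator: a forward push from s followed by geometric-length walks from t whose endpoint residuals are averaged. The adjacency-list input, the function names, the explicit `r_max` argument, and the default values of `eps` and `p_fail` are assumptions of this example. Each sample is scaled by $d_t/d_{V_i}$ as in Eqn. (5), and a walk may stop before taking any step so that its length is geometric on {0, 1, 2, ...}.

```python
import math
import random
from collections import deque


def forward_push(adj, alpha, s, r_max):
    """Algorithm 1: sparse (p_s, r_s) with r_s[v]/d_v <= r_max for every v."""
    p, r, queue = {}, {s: 1.0}, deque([s])
    while queue:
        u = queue.popleft()
        mass = r.get(u, 0.0)
        if mass / len(adj[u]) <= r_max:
            continue  # stale entry, nothing to push
        p[u] = p.get(u, 0.0) + alpha * mass
        r[u] = 0.0
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + (1.0 - alpha) * mass / len(adj[u])
            if r[v] / len(adj[v]) > r_max:
                queue.append(v)
    return p, r


def undirected_bippr(adj, alpha, s, t, delta, r_max, eps=0.5, p_fail=0.01, rng=None):
    """Estimate pi_s[t] on an undirected graph (sketch of Algorithm 2)."""
    rng = rng or random.Random()
    d_t = len(adj[t])
    c = 3.0 * math.log(2.0 / p_fail)
    p, r = forward_push(adj, alpha, s, r_max)
    w = max(1, math.ceil(c * d_t * r_max / (eps ** 2 * delta)))
    total = 0.0
    for _ in range(w):
        v = t
        while rng.random() >= alpha:  # continue with probability 1 - alpha
            v = rng.choice(adj[v])
        total += r.get(v, 0.0) * d_t / len(adj[v])  # X_i = r_s[V_i] d_t / d_{V_i}
    return p.get(t, 0.0) + total / w


# Hypothetical toy graph used only to exercise the function.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    est = undirected_bippr(GRAPH, 0.2, s=0, t=3, delta=0.01, r_max=0.01,
                           rng=random.Random(1))
    print("estimated pi_0[3]:", est)
```

Because every sample lies in $[0, d_t r_{max}]$ and so does its expectation, the estimate can never be further than $d_t r_{max}$ from the true value, whatever the random walks do; the Chernoff argument in Section 2.4 tightens this to the stated relative error.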


2.4 Analyzing the Performance of Undirected-BiPPR 

Accuracy Analysis: We first prove that Undirected-BiPPR returns an unbiased
estimate with the desired accuracy:

Theorem 1. In an undirected graph G, for any source node s, minimum threshold
$\delta$, maximum residual $r_{max}$, relative error $\epsilon$, and failure probability $p_{fail}$,
Algorithm 2 outputs an estimate $\hat{\pi}_s[t]$ such that with probability at least $1 - p_{fail}$
we have: $|\hat{\pi}_s[t] - \pi_s[t]| < \max\{\epsilon\,\pi_s[t],\ 2e\delta\}$.

The proof follows a similar outline as the proof of Theorem 1 in [20]. For
completeness, we sketch the proof here:

Proof. As stated in Algorithm 2, we average over $w = c\,d_t\,r_{max}/(\epsilon^2\delta)$ walks, where
c is a parameter we choose later. Each walk is of length Geometric($\alpha$), and we
denote $V_i$ as the last node visited by the i-th walk; note that $V_i \sim \pi_t$. As defined
above, let $X_i = r_s[V_i]\,d_t/d_{V_i}$; the estimate returned by Undirected-BiPPR is:

\[ \hat{\pi}_s[t] = p_s[t] + \frac{1}{w}\sum_{i=1}^{w} X_i. \]

First, from Eqn. (5), we have that $\mathbb{E}[\hat{\pi}_s[t]] = \pi_s[t]$. Also, ApproximatePageRank
guarantees that for all v, $r_s[v] < d_v r_{max}$, and so each $X_i$ is bounded in $[0, d_t r_{max}]$;
for convenience, we rescale by defining $Y_i = \frac{1}{d_t r_{max}} X_i$, and let $Y = \sum_{i=1}^{w} Y_i$.

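We now show concentration of the estimates via the following Chernoff
bounds (see Theorem 1.1 in [12]):

1. $\mathbb{P}[|Y - \mathbb{E}[Y]| > \epsilon\,\mathbb{E}[Y]] \le 2\exp\left(-\frac{\epsilon^2}{3}\mathbb{E}[Y]\right)$

2. For any $b > 2e\,\mathbb{E}[Y]$, $\mathbb{P}[Y > b] \le 2^{-b}$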

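We perform a case analysis based on whether $\mathbb{E}[X_i] \ge \delta$ or $\mathbb{E}[X_i] < \delta$. First, if
$\mathbb{E}[X_i] \ge \delta$, then we have
$\mathbb{E}[Y] = \frac{w}{d_t r_{max}}\mathbb{E}[X_i] = \frac{c}{\epsilon^2\delta}\mathbb{E}[X_i] \ge \frac{c}{\epsilon^2}$,
and thus, writing $\bar{X} = \frac{1}{w}\sum_{i=1}^{w} X_i$:

\[ \mathbb{P}\left[|\hat{\pi}_s[t] - \pi_s[t]| > \epsilon\,\pi_s[t]\right]
 \le \mathbb{P}\left[|\bar{X} - \mathbb{E}[X_i]| > \epsilon\,\mathbb{E}[X_i]\right]
 = \mathbb{P}\left[|Y - \mathbb{E}[Y]| > \epsilon\,\mathbb{E}[Y]\right] \]
\[ \le 2\exp\left(-\frac{\epsilon^2}{3}\mathbb{E}[Y]\right) \le 2\exp\left(-\frac{c}{3}\right) \le p_{fail}, \]

where the last line holds as long as we choose $c \ge 3\ln(2/p_{fail})$.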

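Suppose alternatively that $\mathbb{E}[X_i] < \delta$. Then:

\[ \mathbb{P}\left[|\hat{\pi}_s[t] - \pi_s[t]| > 2e\delta\right]
 = \mathbb{P}\left[|\bar{X} - \mathbb{E}[X_i]| > 2e\delta\right]
 = \mathbb{P}\left[|Y - \mathbb{E}[Y]| > \frac{w}{d_t r_{max}}\,2e\delta\right]
 \le \mathbb{P}\left[Y > \frac{w}{d_t r_{max}}\,2e\delta\right]. \]

At this point we set $b = 2e\delta w/(d_t r_{max}) = 2ec/\epsilon^2$ and apply the second Chernoff
bound. Note that $\mathbb{E}[Y] = c\,\mathbb{E}[X_i]/(\epsilon^2\delta) < c/\epsilon^2$, and hence we satisfy $b > 2e\,\mathbb{E}[Y]$.
We conclude that:

\[ \mathbb{P}\left[|\hat{\pi}_s[t] - \pi_s[t]| > 2e\delta\right] \le 2^{-b} \le p_{fail} \]

as long as we choose c such that $c \ge \frac{\epsilon^2}{2e}\log_2\frac{1}{p_{fail}}$. The proof is completed by
combining both cases and choosing $c = 3\ln(2/p_{fail})$. □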
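As a quick sanity check of the final step of the proof, the sketch below evaluates the failure bounds that the two cases give for the choice $c = 3\ln(2/p_{fail})$: the first case needs $2\exp(-c/3) \le p_{fail}$, and the second needs $2^{-b} \le p_{fail}$ with $b = 2ec/\epsilon^2$. The function names and the sample parameter values are assumptions of this example.

```python
import math


def accuracy_constant(p_fail):
    """c = 3 ln(2 / p_fail), the value chosen at the end of the proof."""
    return 3.0 * math.log(2.0 / p_fail)


def case1_failure_bound(c):
    """First Chernoff bound with E[Y] >= c / eps^2 gives at most 2 exp(-c / 3)."""
    return 2.0 * math.exp(-c / 3.0)


def case2_failure_bound(c, eps):
    """Second Chernoff bound with b = 2 e c / eps^2 gives at most 2^(-b)."""
    return 2.0 ** (-2.0 * math.e * c / eps ** 2)


def num_walks(c, d_t, r_max, eps, delta):
    """w = c * d_t * r_max / (eps^2 * delta), rounded up to an integer."""
    return math.ceil(c * d_t * r_max / (eps ** 2 * delta))


if __name__ == "__main__":
    for p_fail in (0.1, 1e-3, 1e-6):
        c = accuracy_constant(p_fail)
        print(p_fail, c, case1_failure_bound(c), case2_failure_bound(c, eps=0.5))
```

For relative errors $\epsilon \le 1$ the second case is far from tight, so the first case is the one that determines c.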

Running Time Analysis: The more interesting analysis is that of the
running-time of Undirected-BiPPR; we now prove a worst-case running-time bound:

Theorem 2. In an undirected graph, for any source node (or distribution) s,
target t with degree $d_t$, threshold $\delta$, maximum residual $r_{max}$, relative error $\epsilon$, and
failure probability $p_{fail}$, Undirected-BiPPR has a worst-case running-time of:

\[ O\left(\frac{\sqrt{\log(1/p_{fail})}}{\alpha\epsilon}\sqrt{\frac{d_t}{\delta}}\right). \]

Before proving this result, we first state and prove a crucial lemma from [2]:

Lemma 2 (Lemma 2 in [2]). Let T be the total number of push operations
performed by ApproximatePageRank, and let $d_k$ be the degree of the vertex involved
in the k-th push. Then:

\[ \sum_{k=1}^{T} d_k \le \frac{1}{\alpha\,r_{max}}. \]

Proof. Let $v_k$ be the vertex pushed in the k-th step; then by definition, we have
that $r_s(v_k) > r_{max} d_k$. Now after the local-push operation, the sum residual $\|r_s\|_1$
decreases by at least $\alpha\,r_{max} d_k$. However, we started with $\|r_s\|_1 = 1$, and thus we
have $\sum_{k=1}^{T} \alpha\,r_{max} d_k \le 1$. □

Note also that the amount of work done while pushing from a node v is $d_v$.

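Proof (of Theorem 2). As proven in Lemma 2, the push forward step takes total
time $O(1/(\alpha\,r_{max}))$ in the worst-case. The w random walks, each of expected
length $1/\alpha$, take $O(w/\alpha) = O\left(\frac{\ln(1/p_{fail})}{\alpha\epsilon^2}\cdot\frac{r_{max}}{\delta/d_t}\right)$
time. Thus our total time is

\[ O\left(\frac{1}{\alpha\,r_{max}} + \frac{\ln(1/p_{fail})}{\alpha\epsilon^2}\cdot\frac{r_{max}}{\delta/d_t}\right). \]

Balancing this by choosing
$r_{max} = \frac{\epsilon}{\sqrt{\ln(1/p_{fail})}}\sqrt{\delta/d_t}$, we get total
running-time:

\[ O\left(\frac{\sqrt{\ln(1/p_{fail})}}{\alpha\epsilon}\sqrt{\frac{d_t}{\delta}}\right). \qquad \square \]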
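The balancing step can be illustrated with a small calculation. The sketch below uses a simplified cost model that is an assumption of this example: the push phase costs $1/(\alpha\,r_{max})$ (Lemma 2), the walk phase costs $w/\alpha$ with $w = c\,d_t\,r_{max}/(\epsilon^2\delta)$, and all hidden constants are set to one. Under that model the cost is minimized at $r_{max} = \epsilon\sqrt{\delta/(c\,d_t)}$, where it equals $2\sqrt{c\,d_t/\delta}/(\alpha\epsilon)$.

```python
import math


def total_cost(r_max, alpha, eps, delta, d_t, c):
    """Push cost 1/(alpha r_max) plus walk cost w/alpha under the simplified model."""
    push = 1.0 / (alpha * r_max)
    walks = c * d_t * r_max / (alpha * eps ** 2 * delta)
    return push + walks


def balanced_r_max(eps, delta, d_t, c):
    """Value of r_max at which the push and walk costs are equal."""
    return eps * math.sqrt(delta / (c * d_t))


if __name__ == "__main__":
    alpha, eps, delta, d_t = 0.2, 0.5, 1e-4, 10
    c = 3.0 * math.log(2.0 / 0.01)
    r_star = balanced_r_max(eps, delta, d_t, c)
    print("balanced r_max:", r_star)
    print("cost at balance:", total_cost(r_star, alpha, eps, delta, d_t, c))
    print("cost at 2x:", total_cost(2 * r_star, alpha, eps, delta, d_t, c))
```

Doubling or halving $r_{max}$ away from the balanced value raises the modeled cost by a factor of 1.25, which gives a feel for how sensitive the running time is to this parameter.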


We can get a cleaner worst-case running time bound if we make a natural
assumption on $\pi_s[t]$. In an undirected graph, if we let $\alpha = 0$ and take infinitely long
walks, the stationary probability of being at any node t is $d_t/m$. Thus if $\pi_s[t] < d_t/m$,
then s actually has a lower PPR to t than the non-personalized stationary
probability of t, so it is natural to say t is not significant for s. If we set a significance
threshold of $\delta = d_t/m$, and apply the previous theorem, we immediately get the
following:

Corollary 1. If $\pi_s[t] \ge d_t/m$, we can estimate $\pi_s[t]$ within relative error $\epsilon$ with
probability greater than $1 - \frac{1}{n}$ in worst-case time:

\[ O\left(\frac{\sqrt{\log n}}{\alpha\epsilon}\sqrt{m}\right). \]

In contrast, for the same accuracy guarantee, the running times of Monte-Carlo
and of ApproximatePageRank both scale as $1/\delta$ (cf. Section 1.2), which is $m/d_t$
at this threshold. The FAST-PPR algorithm of [22] has an average-case running time
of order $\sqrt{d/\delta}$ for uniformly chosen targets, but has no clean
worst-case running time bound because its running time depends on the degree
of nodes pushed from in the linear-algebraic part of the algorithm.

3 Extension to Graph Diffusions 

PageRank and Personalized PageRank are a special case of a more general set
of network-centrality metrics referred to as graph diffusions [11,18]. In a graph
diffusion we assign a weight $\alpha_i$ to walks of length i. The score is then a
polynomial function of the random-walk transition probabilities of the form:

\[ f(W,\sigma) := \sum_{i=0}^{\infty} \alpha_i\,(\sigma W^i), \]

where $\alpha_i \ge 0$, $\sum_i \alpha_i = 1$. To see that PageRank has this form, we can expand
Eqn. (1) via a Taylor series to get:

\[ \pi_\sigma = \sum_{i=0}^{\infty} \alpha(1-\alpha)^i\,(\sigma W^i). \]

Another important graph diffusion is the heat kernel $h_{\sigma,\gamma}$, which corresponds to
the scaled matrix exponential of $-(I - W)$:

\[ h_{\sigma,\gamma} = \sigma\,e^{-\gamma(I-W)} = \sum_{i=0}^{\infty} \frac{e^{-\gamma}\gamma^i}{i!}\,(\sigma W^i). \]

In [8], Banerjee and Lofgren extended Bidirectional-PPR to get bidirectional
estimators for graph diffusions and other general Markov chain transition-probability
estimation problems. These algorithms inherited similar performance guarantees
to Bidirectional-PPR; in particular, they had good expected running-time
bounds for uniform-random choice of target node t. We now briefly discuss how
we can modify Undirected-BiPPR to get an estimator for graph diffusions in
undirected graphs with worst-case running-time bounds.

First, we observe that Lemma 1 extends to all graph diffusions, as follows:

Corollary 2. Let any undirected graph G with random-walk matrix W, and
any set of non-negative length weights $(\alpha_i)_{i=0}^{\infty}$ with $\sum_i \alpha_i = 1$ be given. Define
$f(W,\sigma) = \sum_{i=0}^{\infty} \alpha_i\,(\sigma W^i)$. Then for any node-pair $(s,t) \in V^2$, we have:

\[ f(W, e_s)[t] = \frac{d_t}{d_s}\,f(W, e_t)[s]. \]

As before, the above result is stated for unweighted graphs, but it also extends
to random-walks on weighted undirected graphs, if we define $d_i = \sum_j w_{ij}$.

Next, observe that for any graph diffusion $f(\cdot)$, the truncated sum
$f^{\ell_{max}} = \sum_{i=0}^{\ell_{max}} \alpha_i\,(\sigma W^i)$ obeys:
$\|f - f^{\ell_{max}}\|_\infty \le \sum_{k=\ell_{max}+1}^{\infty} \alpha_k$. Thus a guarantee on an
estimate for the truncated sum directly translates to a guarantee on the estimate
for the diffusion.
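The diffusion definition, the symmetry of Corollary 2, and the truncation argument can all be exercised on a small example. The following Python is a sketch under stated assumptions: the graph, the function names, the truncation lengths, and the heat-kernel parameter value are illustrative choices, and the heat-kernel weights are taken to be the Poisson weights $e^{-\gamma}\gamma^i/i!$.

```python
import math


def step(adj, x):
    """One row-vector product x W, with W = D^{-1} A."""
    y = {v: 0.0 for v in adj}
    for u in adj:
        if x[u]:
            share = x[u] / len(adj[u])
            for v in adj[u]:
                y[v] += share
    return y


def diffusion(adj, sigma, weights):
    """Truncated diffusion: sum over i of weights[i] * (sigma W^i)."""
    x = {v: sigma.get(v, 0.0) for v in adj}
    out = {v: 0.0 for v in adj}
    for a_i in weights:
        for v in adj:
            out[v] += a_i * x[v]
        x = step(adj, x)
    return out


def ppr_weights(alpha, l_max):
    """PPR length weights alpha * (1 - alpha)^i for i = 0..l_max."""
    return [alpha * (1.0 - alpha) ** i for i in range(l_max + 1)]


def heat_kernel_weights(gamma, l_max):
    """Heat-kernel length weights exp(-gamma) * gamma^i / i! for i = 0..l_max."""
    return [math.exp(-gamma) * gamma ** i / math.factorial(i) for i in range(l_max + 1)]


# Hypothetical toy graph used only to exercise the functions.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    weights = heat_kernel_weights(gamma=3.0, l_max=60)
    h0 = diffusion(GRAPH, {0: 1.0}, weights)
    h3 = diffusion(GRAPH, {3: 1.0}, weights)
    # Corollary 2: d_s * f(W, e_s)[t] equals d_t * f(W, e_t)[s].
    print(len(GRAPH[0]) * h0[3], len(GRAPH[3]) * h3[0])
    print("dropped weight:", 1.0 - sum(weights))
```

The dropped weight printed at the end is the quantity that bounds the truncation error in the infinity norm.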

The main idea in [8] is to generalize the bidirectional estimators for PageRank
to estimating multi-step transition probabilities (for short, MSTP). Given a
source node s, a target node t, and length $\ell \le \ell_{max}$, we define:

\[ p_s^{\ell}[t] = \mathbb{P}[\text{Random-walk of length } \ell \text{ starting from } s \text{ terminates at } t]. \]

Note from Corollary 2, we have for any pair (s, t) and any $\ell$, $p_s^{\ell}[t]\,d_s = p_t^{\ell}[s]\,d_t$.

Now in order to develop a bidirectional estimator for $p_s^{\ell}[t]$, we need to define
a local-update step similar to ApproximatePageRank. For this, we can modify
the REVERSE-PUSH algorithm from [8], as follows.

Similar to ApproximatePageRank, given a source node s and maximum length
$\ell_{max}$, we associate with each length $\ell \le \ell_{max}$ an estimate vector $q_s^{\ell}$ and a residual
vector $r_s^{\ell}$. These are updated via the following ApproximateMSTP algorithm:


Algorithm 3 ApproximateMSTP(G, s, $\ell_{max}$, $r_{max}$)

Inputs: Graph G, source s, maximum steps $\ell_{max}$, maximum residual $r_{max}$
Initialize: Estimate-vectors $q_s^k = 0$, $\forall k \in \{0,1,2,\ldots,\ell_{max}\}$,
    Residual-vectors $r_s^0 = e_s$ and $r_s^k = 0$, $\forall k \in \{1,2,3,\ldots,\ell_{max}\}$
for $i \in \{0,1,\ldots,\ell_{max}\}$ do
    while $\exists v \in V$ s.t. $r_s^i[v]/d_v > r_{max}$ do
        for $w \in N(v)$ do
            $r_s^{i+1}[w]$ += $r_s^i[v]/d_v$
        end for
        $q_s^i[v]$ += $r_s^i[v]$
        $r_s^i[v] = 0$
    end while
end for
return $\{q_s^{\ell}, r_s^{\ell}\}_{\ell=0}^{\ell_{max}}$
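A possible Python rendering of Algorithm 3 is sketched below. It assumes an adjacency-list graph and stores each level's estimate and residual vectors as sparse dictionaries; these representation choices and the function name are assumptions of this example. Since level i only receives residual mass from pushes at level i - 1, a single pass over each level is enough to reach the stopping condition of the while loop.

```python
def approximate_mstp(adj, s, l_max, r_max):
    """Level-by-level forward push; returns lists (q, r) of sparse dicts, one per length."""
    q = [dict() for _ in range(l_max + 1)]
    # One extra residual level absorbs pushes out of level l_max; it is not returned.
    r = [dict() for _ in range(l_max + 2)]
    r[0][s] = 1.0
    for i in range(l_max + 1):
        for v in list(r[i]):
            mass = r[i][v]
            if mass / len(adj[v]) > r_max:
                share = mass / len(adj[v])
                for w in adj[v]:
                    r[i + 1][w] = r[i + 1].get(w, 0.0) + share
                q[i][v] = q[i].get(v, 0.0) + mass
                r[i][v] = 0.0
    return q, r[: l_max + 1]


# Hypothetical toy graph used only to exercise the function.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    q, r = approximate_mstp(GRAPH, s=0, l_max=4, r_max=0.05)
    for level in range(5):
        print(level, q[level], r[level])
```

With `r_max = 0` nothing is left as residual and `q[l]` is simply the exact l-step distribution from s; larger values of `r_max` leave more of the work to the random-walk phase.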


The main observation now is that for any source s, target t, and length $\ell$,
after executing the ApproximateMSTP algorithm, the vectors $\{q_s^{\ell}, r_s^{\ell}\}_{\ell=0}^{\ell_{max}}$ satisfy
the following invariant (via a similar argument as in [8], Lemma 1):

\[ p_s^{\ell}[t] = q_s^{\ell}[t] + \sum_{k=0}^{\ell}\sum_{v\in V} r_s^{k}[v]\,p_v^{\ell-k}[t]
 = q_s^{\ell}[t] + d_t \sum_{k=0}^{\ell}\sum_{v\in V} \frac{r_s^{k}[v]}{d_v}\,p_t^{\ell-k}[v]. \]

As before, note now that the last term can be written as an expectation over
random-walks originating from t. The remaining algorithm, accuracy analysis,
and runtime analysis follow the same lines as those in Section 2.


4 Lower Bound 

In [22], the authors prove an average case lower bound for PPR-Estimation. In
particular they prove that there exists a family of undirected 3-regular graphs
for which any algorithm that can distinguish between pairs (s, t) with $\pi_s[t] > \delta$
and pairs (s, t) with $\pi_s[t] < \delta/2$ (distinguishing correctly with constant probability
8/9), must access $\Omega(1/\sqrt{\delta})$ edges of the graph. Since the algorithms in [22,20]
solve this problem in time $O(\sqrt{d/\delta})$, where d is the average degree of the
given graph, there remains a $\sqrt{d}$ gap between the lower bound and the best
algorithm for the average case. For the worst case (possibly parameterized by
some property of the graph or target node), the authors are unaware of any lower
bound stronger than this average case bound, and an interesting open question
is to prove a lower bound for the worst case.

5 Conclusion 

We have developed Undirected-BiPPR, a new bidirectional PPR-estimator for
undirected graphs, which for any (s, t) pair such that $\pi_s[t] > d_t/m$, returns
an estimate with $\epsilon$ relative-error in worst-case running time of $\tilde{O}(\sqrt{m}/\epsilon)$. This
thus extends the average-case running-time improvements achieved in [22,20] to
worst-case bounds on undirected graphs, using the reversibility of random-walks
on undirected graphs. Whether such worst-case running-time results extend to
general graphs, or if PageRank computation is fundamentally easier on
undirected graphs as opposed to directed graphs, remains an open question.